Mature nonstructural protein-15 (nsp15) from the severe acute respiratory syndrome coronavirus (SARS-CoV)
contains a novel uridylate-specific Mn2+-dependent endoribonuclease (NendoU). Structure studies of the
full-length form of the obligate hexameric enzyme from two CoVs, SARS-CoV and murine hepatitis virus, and
its monomeric homologue, XendoU from Xenopus laevis, combined with mutagenesis studies have implicated
several residues in enzymatic activity and the N-terminal domain as the major determinant of hexamerization.
However, the tight link between hexamerization and enzyme activity in NendoUs has remained an enigma.

Here, we report the structure of a trimmed, monomeric form of SARS-CoV nsp15 (residues 28 to 335)
determined to a resolution of 2.9 Å. The catalytic loop (residues 234 to 249) with its two reactive histidines
(His 234 and His 249) is dramatically flipped by ~120° into the active site cleft. Furthermore, the catalytic
nucleophile Lys 289 points in a diametrically opposite direction, a consequence of an outward displacement of
the supporting loop (residues 276 to 295). In the full-length hexameric forms, these two loops are packed
against each other and are stabilized by intimate intersubunit interactions. Our results support the hypothesis
that absence of an adjacent monomer due to deletion of the hexamerization domain is the most likely cause for
disruption of the active site, offering a structural basis for why only the hexameric form of this enzyme is active.


Nidoviruses are enveloped, positive-stranded RNA viruses
consisting of the families Coronaviridae, Arteriviridae, and
Roniviridae (5, 34). These viruses have large genomes, and
their gene expression is controlled by a complex and poorly
understood membrane-anchored replicase/transcriptase complex
(12, 33). Components of this complex originate from
the large 5'-proximal open reading frame 1a (ORF1a) and
ORF1b, which span two-thirds of the genome. ORF1a is translated
into polyprotein 1a (pp1a), while polyprotein 1ab (pp1ab) is
formed when -1 ribosomal frameshifting just upstream of the
ORF1a stop codon causes read-through into ORF1b. In the
severe acute respiratory syndrome coronavirus (SARS-CoV),
these polyproteins are proteolytically processed by two
virus-encoded proteases (a main protease and a papain-like
protease) into 16 mature nonstructural proteins (nsp1 to -16) (9,
23, 29). These proteins act in concert to replicate the viral
genome and transcribe a nested set of eight subgenomic
mRNAs, which are then translated to produce the structural
and accessory proteins. One of the RNA-processing enzymes
in the viral replicase-transcriptase is a uridylate-specific
endoribonuclease (NendoU), which is considered a genetic
marker for the nidoviruses, discriminating them from all other
RNA virus families (14). This protein is a distant homologue of
an endoribonuclease, XendoU from Xenopus laevis, that is
involved in processing intron-encoded box C/D U16 small
nucleolar RNA (10, 17) and shares most of its catalytic
determinants.

nsp15 of SARS-CoV is a 346-residue polypeptide that results
from the cleavage of pp1ab at sites 6427RLQ↓SLE6432
and 6773KLQ↓ASQ6778 by the main protease (nsp5). Several
recent studies have focused on structural and functional
characterization of coronavirus nsp15 due to its potential
importance as a drug target. Mutagenic inactivation of this enzyme
renders the virus nonviable, as demonstrated with human CoV
229E (14) and equine arteritis virus (EAV; reference 12).
nsp15 preferentially cleaves the 3' end of uridylates of RNA at
GUU or GU sequences to produce molecules with 2'-3' cyclic
phosphate ends (2, 14). It acts on both double-stranded RNA
and single-stranded RNA (ssRNA) but with different preferences
(2, 14). Mn2+ ions are required for optimal activity (2,
14), with the ion apparently binding weakly to the protein but
producing significant conformational changes upon binding
(2). The recent structures of full-length nsp15 from SARS-CoV
(28) and murine hepatitis virus JHM strain (MHV-JHM)
(38) and its eukaryotic homolog, XendoU from Xenopus
laevis (27), have provided the first structural and mechanistic
descriptions of this new enzyme family. Its catalytic center
resembles the active site of an unrelated nuclease, RNase A (28).

The biological unit of CoV nsp15 is a hexamer (13) (see Fig.
3a), with its six (potential) active sites distributed along its
periphery away from intersubunit interfaces. nsp15 exists in
solution in an equilibrium between monomers and hexamers
(and sometimes other oligomeric states), but only the hexameric
form has been reported to be enzymatically active (13,
38). The link between its hexameric state and its enzymatic
activity has not been clear. However, based on our structure of
a trimmed monomeric form of SARS-CoV nsp15 (lacking 28
N-terminal and 11 C-terminal residues), we present evidence
for the structural basis underlying this link. Examination of
the active site strongly suggests that the absence of
monomer-monomer interactions within the hexamer destabilizes and
significantly displaces two loops (residues 234 to 249 and 276 to
295) in the catalytic domain, resulting in the destruction of the
active site in the isolated monomer.

MATERIALS AND METHODS 

Construct design and cloning. The sequence encoding the 346-residue nsp15
(NP_828872.1; gi:29837507) corresponds to nucleotides 19551 to 20588 in the
genome of the SARS-CoV Tor2 strain (nomenclature of nsp’s is as per the work
of Snijder and coworkers [33]). A construct corresponding to residues 28 to 346
of nsp15 was amplified by PCR from genomic cDNA of the SARS-CoV Tor2
strain by use of Taq polymerase and primer pairs containing the predicted 5' and
3' ends (forward, 5'-ATGAATAATGCTGTTTACACAAAGGTAGATGGTAGGGCCGGCCGGG-3';
reverse, 5'-TTGTAGTTTTGGGTAGAAGGTTTCAACATGTCCCCGGCCGGCCCTA-3').
The PCR product was cloned between
FseI and PmlI sites into the expression vector pMHIF, a derivative of pBAD
(Invitrogen). Expression in pMHIF is driven by the araBAD promoter, and the
recombinant protein is produced with a short, noncleavable N-terminal
Thio6His6 tag (MGSDKIHHHHHH). As part of the crystallization and diffraction
optimization strategy, C-terminal truncation mutants were generated by
insertion of appropriate stop codons into the coding sequence by site-directed
mutagenesis using a Stratagene QuikChange kit. The construct described in this
paper corresponds to residues 28 to 335.

Expression and purification. A sequence-verified clone was transformed into
TOP10 cells (Invitrogen). An overnight culture from a fresh transformant was
used to inoculate flasks of 2×YT-ampicillin medium. The culture was grown at
37°C to an optical density at 600 nm of 0.6, induced with 0.2% (wt/vol)
L-arabinose, and further grown at 14°C overnight. The cells were harvested by
centrifugation and lysed by sonication in buffer containing 50 mM Tris-HCl, pH
8.0, 300 mM NaCl, 10% glycerol, 0.5 mg/ml lysozyme, 100 µl/liter benzonase, and
EDTA-free protease inhibitor (one tablet per 50 ml buffer; Roche). The lysate
was clarified by ultracentrifugation at 45,000 rpm for 20 min at 4°C, and the
soluble fraction was applied onto a metal chelate column (Talon resin charged
with cobalt; Clontech). The column was washed with 20 mM Tris-HCl, pH 7.8,
300 mM NaCl, 10% glycerol, 5 mM imidazole and eluted with 25 mM Tris-HCl,
pH 7.8, 300 mM NaCl, 150 mM imidazole. The eluate was then fractionated by
anion exchange on a Poros HQ column using a linear gradient of NaCl (0 to 1
M) in 25 mM Tris-HCl, pH 8.0. The pure fractions were pooled and concentrated
to 1.8 mM. The protein was either flash frozen in liquid nitrogen for later use or
used immediately for crystallization trials.

Crystallization and data collection. Crystals were grown by the nanovolume
sitting drop method (30) using drops consisting of 100 nl 1.8 mM protein and 100
nl crystallant. Large (~300-µm by ~20-µm by ~200-µm) crystals of hexagonal
morphology typically grew overnight in 0.2 M sodium bromide, 0.1 M sodium
acetate, pH 5.5, and 25% polyethylene glycol (PEG) 2000MME. These were
cryoprotected in a solution containing mother liquor and 15% glycerol and flash
frozen in liquid nitrogen. While optimizing diffraction, crystals were screened at
the Stanford Synchrotron Radiation Laboratory beamlines 1-5, 11-1, 11-3, and
9-1 and the General Medicine and Cancer Institutes Collaborative Access Team
(GM/CA-CAT) beamline 23-ID of the Advanced Photon Source (Argonne, IL)
by use of Blu-Ice (19). The complete 2.9-Å native data set used for phasing and
refinement was collected at the GM/CA-CAT. The reflections were indexed in
the primitive hexagonal lattice and scaled in P3 using HKL2000 (21). Subsequent
molecular replacement attempts (see below) revealed the actual space group to
be P3₁.

Phasing and refinement. Initial phases were obtained by molecular replacement
using the full atom monomer structure of full-length SARS-CoV nsp15
(kindly provided by Bruno Canard) with the program Phaser (26) within CCP4
(6) using data from 50.0 to 2.9 Å. Rigid-body refinement using Refmac5 (20)
revealed a clearly interpretable electron density map. Phases and the model itself
were further improved by one round of Arp/wARP (16). Initial model building
was guided by a composite omit map calculated using CNS (4) to minimize
model bias. Subsequent improvement of the coordinates was achieved by manual
model building in Coot (8) alternating with restrained refinement using Refmac5.
Though noncrystallographic symmetry was identified between the four


TABLE 1. Data collection, molecular replacement, and refinement
statistics for nsp15

Parameter^a                      Finding or value^b

Data collection
  Lattice                        Primitive hexagonal
  Space group                    P3₁
  Unit cell
    a (α = 90°)                  98.974 Å
    b (β = 90°)                  98.974 Å
    c (γ = 120°)                 214.926 Å
  Wavelength                     0.9793 Å
  Resolution range               85-2.9 Å (3.0-2.90 Å)
  Total observations             581,802
  Unique reflections             52,135 (3,114)
  Completeness                   98.1% (89.9%)
  Redundancy                     2.9 (1.9)
  Mean I/σ(I) value              14.4 (1.44)
  Rsym on I value                0.085 (0.466)

Refinement statistic
  Resolution range               50-2.90 Å
  Rcryst value                   0.251 (0.341)
  Rfree value                    0.292 (0.429)
  Bond length RMSD               0.015 Å
  Bond angle RMSD                1.98°
  Average isotropic B value      43.1 Å²


a RMSD, root mean square deviation. Equations:
$R_{\mathrm{sym}} = \sum_{hkl}\left[\left(\sum_j |I_j - \langle I \rangle|\right) / \sum_j |I_j|\right]$;
$R_{\mathrm{work}} = \sum_{hkl} |F_o - F_c| / \sum_{hkl} |F_o|$ (listed above as $R_{\mathrm{cryst}}$),
where $F_o$ and $F_c$ are the observed and calculated structure factors.
Five percent of randomly chosen reflections were used to calculate $R_{\mathrm{free}}$.

b Values in parentheses are for data corresponding to the outermost reflection
shell.


monomers, NCS restraints were not used during refinement because of subtle
differences observed between the monomers in the asymmetric unit. The final
model statistics, validation, and stereochemical quality are summarized in Table 1.
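
The R factors defined in the footnote to Table 1 can be sketched in a few lines. The following is a minimal illustration and not the Refmac5 implementation: the function names and the toy amplitudes are hypothetical, and the observed and calculated amplitudes are assumed to be on a common scale already.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum|Fo - Fc| / sum|Fo| over the supplied reflections."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def split_free_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the free (test) set."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)

# Hypothetical structure factor amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0] * 8
f_calc = [90.0, 85.0, 50.0, 45.0, 25.0] * 8

work, free = split_free_set(len(f_obs))
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
```

Because the free reflections are excluded from refinement, R free is expected to be somewhat higher than R work, as in Table 1.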

Electrophoretic mobility shift assay. Electrophoretic mobility shift assays were
used to assess RNA binding by the monomeric nsp15 construct. A short,
5'-fluoresceinated RNA oligomer (5'-GGUUU-3') was designed to mimic the
preferred recognition site of nsp15 (14). The positive control, purified SARS-CoV
nsp10, was produced as described previously (15). Protein samples were mixed
with 0.8 µg of RNA in assay buffer containing 150 mM NaCl, 50 mM Tris-HCl,
and 5 mM CaCl2, pH 8.0. Samples were incubated at 37°C for 1 h and analyzed
by native electrophoresis on 6% acrylamide DNA retardation gels (Invitrogen).
Fluoresceinated RNA was visualized using a UV light source equipped with a
digital camera. Protein was detected by SYPRO-ruby poststain according to the
manufacturer’s protocol (Molecular Probes). Densitometric analysis was performed
using a flatbed scanner with ImageJ software (NIH). The amount of
bound protein was calculated relative to the maximum binding observed in each
experiment. Kd values were determined from the midpoints of the fitted titration
data as described elsewhere (32).
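
As a rough illustration of reading a midpoint from titration data, the sketch below normalizes band intensities to the maximum observed and linearly interpolates the protein concentration at 50% binding. It is an assumption-laden simplification: the fitting procedure of reference 32 is not reproduced, the function names are hypothetical, and the numbers are invented for illustration.

```python
def normalize(bound):
    """Express each densitometry value relative to the maximum observed."""
    top = max(bound)
    return [b / top for b in bound]

def midpoint_concentration(conc, fraction_bound, level=0.5):
    """Linearly interpolate the concentration at which binding crosses `level`."""
    points = list(zip(conc, fraction_bound))
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= level <= f1 and f1 != f0:
            return c0 + (level - f0) * (c1 - c0) / (f1 - f0)
    raise ValueError("titration does not cross the requested level")

# Hypothetical titration: protein concentration (µM) and band intensities
# in arbitrary units. These are not measurements from this study.
conc = [0.0, 1.0, 2.0, 4.0, 8.0, 16.0]
intensity = [0.0, 20.0, 40.0, 60.0, 75.0, 80.0]

kd_estimate = midpoint_concentration(conc, normalize(intensity))
```

Interpolation between neighboring points is only a stand-in for a fitted binding curve; with sparse or noisy titrations a proper fit is preferable.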

PFO-PAGE. Perfluoro-octanoic acid polyacrylamide gel electrophoresis 
(PFO-PAGE) was performed according to the method of Ramjeesingh et al. (25) 
to assess protein stoichiometry. Purified protein samples were mixed 1:1 with 
PFO loading buffer containing 8% (wt/vol) PFO, 100 mM Tris base, 20% (vol/ 
vol) glycerol, and 0.05% (wt/vol) orange G. Samples were loaded onto precast 4 
to 20% Tris-glycine gels, and electrophoresis was performed with a standard 
Tris-glycine running buffer to which 0.5% (wt/vol) PFO was added. Gels were 
stained with SYPRO-ruby according to the manufacturer’s instructions and
photographed with a digital camera attached to a UV light source.

Protein structure accession number. The structure factors and coordinates of
the structure have been deposited in the Protein Data Bank (PDB) with
accession number 2OZK.

RESULTS AND DISCUSSION 

Expression and crystallization of monomeric nsp15 construct.
We obtained high expression (final yield, 35 mg/liter) of
a pMHIF construct corresponding to nsp15 residues 28 to 346
FIG. 1. Electrophoretic mobility shift assays. Monomeric SARS-CoV nsp15 (A and B) and native nsp10 (C and D) were incubated with
fluoresceinated ssRNA substrate, and complexes were separated by native electrophoresis on 6% polyacrylamide gels. Fluoresceinated RNA was
visualized under UV illumination (A and C), and protein was visualized similarly after SYPRO-ruby poststaining (B and D). Lanes: P, protein only;
0, RNA only; 1 to 7, gradient of protein with a fixed amount of RNA. Free RNA is indicated with a small filled triangle, protein is indicated with
a large filled triangle, and protein-RNA complexes are indicated with open triangles. (E) nsp10 was used as a positive control; its binding affinity
to the fluoresceinated RNA substrate is shown. The dotted lines indicate the concentration of nsp10 that correlates to 50% binding. (F) PFO-PAGE
gel indicating that trimmed nsp15 exists primarily as a monomer and as a dimer; although a variety of larger forms exists, this construct
appears to be specifically defective in trimer formation.


in TOP10 cells. This construct lacks the first 28 residues (in
particular Glu 3) of the N-terminal domain, which are important
for oligomerization (13). It eluted predominantly as a
monomer in gel filtration studies (data not shown), although
small amounts of higher oligomeric forms were detected after
PFO-PAGE analysis (Fig. 1F).

This construct crystallized overnight in several PEG-containing
conditions but diffracted poorly (~6 Å). We expressed
several C-terminal truncation mutants, deleting from 5 to 19
residues, all of which had comparably high expression levels
and were monomeric in solution. Among these, a construct
from residues 28 to 335 (described in the rest of this paper)
crystallized in a non-PEG condition, which upon optimization
with additive screens yielded crystals that diffracted to 2.9 Å.
Attempts to improve diffraction using annealing or dehydration
(including the procedure described for MHV nsp15 in
reference 38) were not successful.

It is interesting that while our enzymatically inactive (see
below) truncation mutant of nsp15 expressed in high amounts,
the full-length wild-type (WT) construct expressed only weakly
(data not shown). Ricagno et al. also reported low expression
for WT SARS-CoV nsp15 (28). These results taken together
reinforce the suggestion by Xu et al. that expression of WT
MHV-JHM nsp15 may be toxic to Escherichia coli, causing
slow cell growth and low protein yields (38). An incidental
mutation, F307L, and active site mutations H262S, H277S, and
K317S (all detrimental to enzymatic activity) expressed at
much higher levels. Even in EAV NendoU, only the inactive
D3014A mutant could be expressed in E. coli, whereas WT
expression was not tolerated (22). Functional nsp15 likely
acts on its own (and also cellular) mRNA, limiting its
expression and impeding normal cell growth.

Functional characterization of monomeric nsp15. Bhardwaj
et al. reported that only the hexameric form of full-length WT
nsp15 binds RNA and that Mn2+ ions increase the affinity of
interaction (1). We therefore assessed the ability of monomeric
nsp15 construct 28-346 and of eight additional truncation
constructs (lacking 5, 7, 9, 11, 13, 15, 17, and 19 C-terminal
residues, respectively) to bind and cleave ssRNA. Electrophoretic
mobility shift assays did not reveal an interaction
between any form of monomeric nsp15 and fluoresceinated
ssRNA substrates (Fig. 1), confirming that monomeric nsp15
lacks RNA binding activity, as previously reported (1, 2).
We did not observe any effect of Mn2+ ions on the affinity of
monomeric nsp15 for RNA (data not shown). The modulation of RNA
binding by Mn2+ ions is therefore probably a feature only of
the intact nsp15 hexamer. XendoU, a structural and functional
homologue of viral NendoU from Xenopus laevis, is biologically
active as a monomer and does bind RNA in the absence
of Mn2+ ions (10). However, changes to the affinity of this
interaction in the presence of Mn2+ ions have not been
investigated.

FIG. 2. (a) Contents of the asymmetric unit. The four monomers (designated A, B, C, and D) are colored blue, green, cyan, and orange,
respectively. (b) The three-domain topology of the truncated SARS-CoV nsp15 monomer. Color is ramped from the N terminus (blue) to the C
terminus (red). Secondary structural elements are numbered β1 to β13 for β-strands, while α-helices are numbered α1 to α9. (c) Superimposition
of the four monomers of the asymmetric unit. Subunits A/B, which form a pair, are colored blue and red, while the other two subunits, C and D,
are colored green and magenta, respectively.


Description of the asymmetric unit. The structure of SARS-CoV
nsp15 reported here covers residues 28 to 335 of nsp15
from the Tor2 strain (Fig. 2). Data collection and refinement
statistics are summarized in Table 1. Cell content analysis
yields a Matthews coefficient of 5.06 Å³/Da and a large solvent
content of 74%, which might explain the poor diffraction of these
otherwise large (0.3-mm) crystals. The final refined model
contains four monomers arranged coaxially in the asymmetric unit
of the hexagonal P3₁ lattice (Fig. 2a). The four monomers exist
as two pairs, A/B and C/D. Monomers A and B are virtually
identical to each other but differ from monomers C and D (Fig.
2c). The N-terminal β-hairpin is tilted by ~45° in monomers
A/B in comparison to the pair C/D. None of the intersubunit
interactions seen between monomers within the asymmetric
unit, or with their symmetry mates, recapitulate those
observed for the functionally relevant hexamer of full-length
nsp15 of SARS-CoV (Fig. 3a) or MHV.
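
The cell content analysis mentioned above rests on simple arithmetic, sketched here under stated assumptions. The cell edges are taken from Table 1; the molecular mass passed in is a hypothetical round figure rather than a value from this paper, and the solvent estimate uses Matthews' common approximation, so the output need not reproduce the reported numbers exactly.

```python
import math

def hexagonal_cell_volume(a, c):
    """Volume (Å^3) of a cell with a = b, alpha = beta = 90°, gamma = 120°."""
    return a * a * c * math.sin(math.radians(120.0))

def matthews_coefficient(cell_volume, n_asu, n_mol_per_asu, mol_weight_da):
    """V_M = V_cell / (Z * MW), in Å^3/Da, where Z is molecules per cell."""
    return cell_volume / (n_asu * n_mol_per_asu * mol_weight_da)

def solvent_fraction(v_m, protein_term=1.23):
    """Matthews' approximation: solvent fraction ≈ 1 - 1.23 / V_M."""
    return 1.0 - protein_term / v_m

# Cell edges from Table 1 (Å).
volume = hexagonal_cell_volume(98.974, 214.926)

# P3(1) has three asymmetric units per cell; the model has four monomers
# per asymmetric unit. The molecular mass below is an assumed placeholder.
v_m = matthews_coefficient(volume, n_asu=3, n_mol_per_asu=4,
                           mol_weight_da=35000.0)
solvent = solvent_fraction(v_m)
```

The result is sensitive to the molecular mass and to the number of molecules assumed per asymmetric unit, which is why both are left as explicit parameters.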
FIG. 3. (a) Surface representation of hexameric full-length (FL) nsp15 (PDB code 2H85). The individual subunits are colored and marked A
to F. The N-terminal 28 residues of the four monomers A, C, D, and F are shown in red and labeled A_N, C_N, D_N, and F_N, respectively. (b)
Superposition of truncated (blue) and full-length (yellow) SARS-CoV nsp15 with MHV nsp15 (cyan). The genomic packaging signal of MHV is
colored red, and the disordered region is shown as a dotted line. The sequence alignment of the packaging signal of MHV is shown below the
ribbon diagram, with the implicated region (P192-A215) indicated with a red line. Sequence IDs are gi:29837507 for SARS-CoV and gi:37999877
for MHV. (c) Conformational rearrangement of the active site loop and supporting loop in the truncated and full-length forms of nsp15. Truncated
monomeric nsp15 is shown in green, while that of the hexameric WT is in light brown. The “supporting loop” (residues 276 to 295) and the “active
site loop” (residues 234 to 249) are colored red in monomeric nsp15 and blue in full-length nsp15. Equivalent residues in the active sites of the
two structures are labeled (regular font in monomeric nsp15 and in italics for the full-length enzyme). (d) Electron density map (2Fo-Fc) contoured
at 1.1 σ around key residues in the active site of monomeric nsp15.


As anticipated, each monomer adopts an overall topology
similar to that of the full-length WT nsp15, with three distinct
structural domains (Fig. 2b). The first domain is incomplete in
our construct and consists of a β-hairpin, which is displaced by
about 90° in monomers C/D with respect to A/B (Fig. 2c). A
surface representation of the full-length hexameric SARS-CoV
nsp15 is shown in Fig. 3a, highlighting the role of the N-terminal
residues (A_N to D_N) in stabilizing the hexamer. A structure
superposition of MHV nsp15 and the two forms of SARS-CoV
nsp15 is shown in Fig. 3b. The central domain of
truncated nsp15 superimposes nearly perfectly on the two
full-length structures, but there are several important differences in
the catalytic domain. A superposition of one of the truncated
monomers on the structure of the full-length nsp15 is shown in
Fig. 3c. Polypeptide segments corresponding to residues 240 to
247 and 282 to 286 are disordered only in the C and D monomers
and are clearly ordered in A/B. In monomers A/B, the
loop consisting of residues 308 to 317 is bent outwards in a
direction away from the active site by about 70° with respect to
the corresponding region in C/D. The importance of these
differences is discussed later.

Structural basis for the functional hexamer. Most striking of
all is the comparison between structures of full-length SARS-CoV
nsp15 and our trimmed construct, particularly in the
catalytic subunit (Fig. 3c). In monomers A (and B) in the
asymmetric unit of our construct, the loop consisting of residues
234 to 249 (referred to here as the “active site loop”)
spanning the two active site histidines (His 234 and His 249) is
flipped by ~120° into the active site cleft. Lys 289, another
active site residue, points in a diametrically opposite direction
(a consequence of a significant displacement of the loop from
residues 276 to 295), making it very unlikely to be available for
catalysis. Furthermore, it is precisely in each of these regions
that significant differences were observed between monomers
A/B and C/D in the asymmetric unit of our crystal structure. In
C (and D), electron density for these loops is missing, indicating
that they are flexibly disordered. In the structures of the
functional full-length nsp15 hexamer, the “active site loop” and
the “supporting loop” are packed against each other and are
stabilized by intimate interactions with residues in the adjacent
monomer. Although our structure is of a nominal (2.9-Å)
resolution, the electron density for these two loops is quite well
ordered in two of the monomers and clearly traceable after
residues were deleted in early stages of model improvement as
well as in the composite omit map. The electron density
around the active site residues is shown in Fig. 3d. Residues
Asp239-Gly246 and Leu248 in the active site loop interact
with Ile280, Asp282, Thr285-Lys289, Ser293, and Val294 in
the “supporting loop.” The supporting loop, in turn, makes
extensive interactions with the adjacent monomer (for example
monomer A with monomer E in the hexamer; PDB 2H85).
Specifically, Lys264, Glu266, Phe268-Met271, Asn277, Phe279-Ser288,
Cys290, and Gly291 from the supporting loop interact
with five sequentially distinct regions on the adjacent monomer
(Val9-His14, Asp16, Thr33, Val35, Val40-Ile42, Arg61,
Ile63, Tyr88, Val162-Val165, Leu167-Val172), which form a
contiguous patch at the intersubunit interface. In the structure
of the truncated form described here, the absence of the adjacent
monomer likely causes a peeling away of the “supporting loop”
from residues 276 to 295, causing an apparent inward
collapse of the “active site loop” from residues 234 to 249 into
the active site, thus possibly destroying it. Alternatively, just
disorder in this region, as found in chains C and D in our
crystal structure, may contribute to a destruction of a productive
active site conformation. In chains C and D, the “active
site loop” and the “supporting loop” face the large solvent
channels in the crystal, while they are stabilized in their “inactive”
conformation by crystal contacts in chains A and B.


J. Virol. 



FIG. 4. Superposition of XendoU structure (PDB ID 2C1W; 
green) onto the catalytic domain of SARS-CoV NendoU (PDB ID 
2H85; light pink). The two disordered loop regions in the XendoU 
structure are indicated by dotted lines. The secondary structures are 
numbered using the same scheme as for Fig. 2b. The active site loops 
of the two structures are highlighted in red. 


The eukaryotic structural homologue of NendoU from
Xenopus laevis (referred to as XendoU in many studies) exists
as a monomer and yet is enzymatically active (10). Structural
comparison reveals that the single domain of XendoU corresponds
to the catalytic domain of CoV nsp15. Most secondary
structure elements of SARS-CoV nsp15 superimpose well with
XendoU (Fig. 4). However, there are several additional structural
elements in XendoU which decorate the periphery of the
molecule, for instance, the helix comprised of residues 10 to 21
followed by a long hairpin loop held together by a short
two-stranded β-sheet (strand 1, residues 32 to 35; strand 2, residues
60 to 62) and a short helical turn (residues 64 to 67). Two
helices in the SARS-CoV structure (residues 206 to 213 and
219 to 225) are extended in XendoU (residues 73 to 86 and 96
to 111, respectively). These extensions with a longer connecting
loop (10 residues in XendoU compared to 6 residues in
SARS-CoV NendoU) act as one of the walls of the monomer,
providing stability to the active site, possibly by preventing it
from splaying open or collapsing. Two additional helices from
residues 113 to 125 and from 133 to 144 provide additional
undergirding to the molecule and play a role in stabilizing the
active site loop by holding up the loop from residue 165 to 178.
Further, residues 145 to 158 form a hairpin loop that also
provides stability, holding up the loop consisting of residues
165 to 178. These additional structural elements hold the cuplike
molecule together, likely fulfilling the role of adjacent
monomers in the CoV nsp15 hexamer. Hence, a comparison of
our structure of the monomeric form of nsp15 with those of
full-length CoV nsp15 and monomeric XendoU offers a likely
structural basis for the catalytic activity of the intact CoV nsp15
hexamer only.

It is fair to assume that nsp15 exists as a monomer (perhaps
as part of a long polyprotein precursor) before it assembles as
a hexamer. One would therefore expect that in its polyprotein
precursor, or when freshly released from it by the proteolytic
action of nsp5, nsp15 would bear the same conformationally
inactive catalytic site (the default “off” position) in the
absence of the other “fortifying” monomers of the hexamer.
Translation of additional monomers and subsequent hexamerization
would then switch the protein to the “on” position. In
vitro, nsp15 does occur in equilibrium between the hexamer
form and lower oligomeric states (13); hence, disruption of the
hexamer would be a conceivable means of regulating the effective
concentration of the enzyme, though there is not yet
experimental evidence for this. It is worth noting that the pp1b
region containing nsp12 to -16 is accessed by only up to 30% of
translating ribosomes (3). Since nsp15 is absolutely required
for viral RNA synthesis (13, 22), a minimum of 18 translation
events may be required before the phase of RNA synthesis
involving nsp15 can effectively occur. Therefore, one would
anticipate that a high level of protein synthesis is a prerequisite
for the discontinuous phase of coronaviral RNA synthesis (31).
This in turn raises the interesting implication that CoVs may
be unusually dependent on translation compared to other viruses.
Whether this is indeed the case, however, remains to be
experimentally ascertained.
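
The estimate of translation events can be made explicit with a one-line calculation: the six subunits of a hexamer divided by the fraction of ribosomes that reach the frameshifted region. The sketch below is illustrative only. The function name is hypothetical, and the reading that the figure of 18 corresponds to a fraction of about one-third is an assumption; a fraction of exactly 30% would give 20.

```python
import math

def min_translation_events(subunits, readthrough_fraction):
    """Expected number of translation events needed to yield `subunits`
    copies of a protein made only by frameshifting ribosomes."""
    if not 0.0 < readthrough_fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # A small tolerance keeps floating-point noise from rounding up.
    return math.ceil(subunits / readthrough_fraction - 1e-9)

events_one_third = min_translation_events(6, 1.0 / 3.0)
events_thirty_percent = min_translation_events(6, 0.30)
```

Either way, the point made in the text stands: well over a dozen rounds of translation are needed, on average, before a single hexamer can assemble.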

Structural basis for inhibitor design. The activity of nsp15 is
critical for the viability of the virus, making it potentially an
excellent drug target. As mentioned earlier, mutation of catalytic
residues of the NendoUs in human CoV 229E (14) and
EAV (22) inactivates the viruses. As described earlier, the
monomeric form of nsp15 is inactive due to the loop consisting
of residues 234 to 249 falling into the active site. Since this loop
fits well into the active site, designed peptide sequences derived
from this segment could also be reasonably expected to
bind to the active site and could provide a good starting point
for the design of inhibitory peptidomimetic molecules, an avenue
that is being actively pursued by our group. A second
avenue for inhibitor design that is being explored is based on
peptides or small molecules that inhibit oligomerization by
disruption of the interface. These could be more desirable than
active site inhibitors, which could potentially interfere with the
similar active sites (such as that of RNase A) in the host cell.
They would merely disrupt protein-protein interactions essential
for nsp15 hexamerization and hence the activity of the
enzyme.

The MHV genomic RNA packaging signal is embedded in
the nsp15 ORF. A striking difference between the MHV and
SARS-CoV nsp15 structures occurs in the region linking the
catalytic and central domains (Fig. 3b). The MHV nsp15
interdomain linker is long and relatively flexible, as evidenced by
the lack of observable electron density in the structure, in
contrast to the short, structured SARS-CoV linker (28, 38).
The MHV genomic RNA packaging signal, which functions at
the level of RNA sequence and secondary structure, maps
entirely within the extended MHV nsp15 interdomain linker
(24). CoVs belonging to group 2a, which includes MHV, are all
predicted to encode long nsp15 interdomain linkers with elevated
proline and glycine content (25 to 33% P+G), and the
genomic RNA packaging signal of bovine CoV maps to this
region (7). In contrast, CoVs of groups 1, 2b, and 3 are predicted
to contain shorter nsp15 interdomain linkers of the type
found in SARS-CoV. Phylogenetic and structural evidence
therefore suggests that the nsp15 ORF contains both a likely
RNA packaging signal and a likely flexible interdomain protein
loop in the group 2a CoVs only.
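
The proline-plus-glycine (P+G) percentages quoted for these linkers are simple composition counts. A minimal sketch of the calculation follows; the function name is hypothetical and the peptide is an invented linker-like sequence, not one taken from any nsp15 ORF.

```python
def residue_fraction(sequence, residues):
    """Fraction of positions in `sequence` occupied by any of `residues`."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(seq.count(r) for r in residues) / len(seq)

# Hypothetical 20-residue peptide, for illustration only.
linker = "PGSPTGAPDGEPLGSPAKTG"

p_plus_g = residue_fraction(linker, "PG")   # proline + glycine content
d_plus_e = residue_fraction(linker, "DE")   # aspartate + glutamate content
```

Applied to a real linker sequence, the same count gives the P+G and D+E percentages of the kind discussed in this section.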

Several instances of signals that function at the level of RNA
structure but are encoded within ORFs are known. Examples
for which the structure of the encoded protein is not known
include the hepatitis delta virus packaging signal in the large
delta antigen ORF (36), the tobacco mosaic virus packaging
signal in the GP4 movement protein ORF (35), and the poliovirus
cis-acting replication element within the 2C coding region
(11). The protein regions encoded by these RNA elements are
characterized by elevated proline-plus-glycine (19% P+G) and
aspartic acid-plus-glutamic acid (16% D+E) content, consistent
with local flexibility at the level of protein structure.

Examples of long RNA signals within ORFs for which the
structure of the encoded protein is known include the human
rhinovirus cis-acting replication element (reference 18; PDB
code 4RHV), black beetle virus and Nodamura virus packaging
signals (reference 39; PDB codes 2BBV and 1NOV,
respectively), and the MHV packaging signal (reference 37; PDB
code 2GTH). These regions are also collectively enriched in
proline and glycine (23% P+G) and adopt mainly randomly
coiled conformations. As would be expected from the amino
acid content, only 16% of the amino acid residues encoded by
these RNA elements participate in α-helical and β-strand
secondary structures in the crystallized proteins. Therefore, from
the limited data available it would appear that long RNA
signals located within protein-encoding regions generally
encode randomly coiled, proline- and glycine-rich protein loops.

CONCLUSION 

Here we have presented the structure of a truncated,
monomeric form of SARS-CoV nsp15 that lacks the N-terminal
hexamerization domain. The resulting monomerization led to
the collapse of the active site in the catalytic domain. This has
provided a mechanistic explanation for the critical role of
hexamerization in the enzymatic activity of the full-length
coronaviral nsp15. This study also presents a structural basis for the
possible design of novel inhibitors that target the active site as
well as interfere with hexamerization. Finally, we point out that
the RNA packaging signals in group 2a CoVs (and several
other viruses) map to unstructured protein loops, offering a
means for independent adaptation of the RNA packaging signal
without materially altering the structure of the encoded
protein.